From Norms to Neuropsychopathology: Exploring Neuroanatomical & Neuropsychiatric Variation in Alzheimer’s Disease Through Normative Modeling


This study extends analyses conducted by Verdi et al. (2023), who utilized normative modeling techniques to delineate neuroanatomical heterogeneity in Alzheimer’s disease. We employ a similar methodological framework to supplement these insights by integrating both structural MRI data and neuropsychiatric symptom profiles. This integration allowed us to explore more deeply the neuroanatomical and neuropsychiatric underpinnings of Alzheimer’s disease across different phenotypes within our ADNI subset. This study was conducted by the Applied Neuroimaging Statistics Ressearch Laboratory at McLean Hospital & Harvard Medical School.

Author: Adrian Medina

Date: October 1, 2024

Project Overview

Analytic Background:
Our study data were a subset derived from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) ADNI3 wave data bank. This was due to the harmonization of scanner sequence protocols across data collection sites that began during this wave. We only included subjects that had both structural MRI and PET (amyloid & tau) data collected, followed by QA of these data. This analysis leveraged a normative model developed by the Predictive Clinical Neuroscience Group at the Donders Institute and Radboud UMC, which aims to predict and stratify brain disorders on the basis of neuroimaging data. Specifically, we used ‘HBR_lifespan_36K_79sites’, which makes use of the Hierarchical Bayesian Regression algorithm trained on 37,128 subjects from 79 different collection sites, across the human lifespan.

Considerations:

  • Ideally, both adaptation and testing sets would be balanced by age, sex, and site (covariates) following something like a 60/40 or 70/30 split of healthy controls. However, given our limited sample size, we decided to keep all of our healthy control and patient data isolated.
    • In our analysis: The adaptation set is used to calibrate for site (only healthy controls) while the testing set is used exclusively for patient-phenotyped data.
      • Healthy control phenotypes include ‘A-C-’ (amyloid-tau negative, cognitive impairment negative) & ‘A+C-’ (amyloid-tau positive, cognitive impairment negative).
      • Patient-phenotypes include ‘A+C+’ (amyloid-tau positive, cognitive impairment positive) & ‘A-C+’ (amyloid-tau negative, cognitive impairment positive).
  • As a consequence of simultaneous conditions, a smaller sample size & larger site numbers (59 sites), our group elected to utilize the MRI manufacturer (3 total) of the subject’s imaging data to act as a pseudo ‘site’ variable thus giving more power to viably calibrate for potential ‘site’ influences.
    • ‘1’ = ‘GE’
    • ‘2’ = ‘Philips’
    • ‘3’ = ‘Siemens’

Code Workflow

  1. Data Frame Initialization: Install/load packages needed to run any/all chunks in this code file, and load/merge dataset(s) with imaging &/or non-imaging variables of interest. The finalized, merged file is what will be used to create adaptation-testing sets in the next step.
    • Note: Some manual edits may be needed to match your imaging variables to the column headers listed in the normative model CSV template (listed in the Normative Modeling GUI).
  2. Covariate Distributions, Data Splitting, & Balance Verification: Visualize distributions of the covariates of interest prior to splitting your dataset. Produce the adaptation and testing datasets needed to compute deviation scores of subject neuroimaging (structural MRI) data. Visualize the proportions of covariates of interest following the adaptation-testing split to verify balance between the two sets.
    • Note: Upon completion of this step, you are then able to run your sets through the normative model deviation scores computation via either the Normative Modeling GUI or Braincharts GUI (if you want to recreate the Python code yourself).
  3. Analytic Data Preparation: Preparing analytical data for both group comparisons of cortical thickness and broader statistical analyses. Loading and cleaning study data from its CSV file, which includes renaming columns for clarity, re-coding categorical variables such as biomarkers and sites, and setting appropriate factor levels. For analyses excluding group cortical thickness comparisons, the procedure extends to managing deviation scores, generating dummy variables, eliminating unnecessary columns, and ensuring the data’s integrity for modeling. Additionally, the Destrieux atlas data is loaded from the ggseg package, with preliminary visualizations executed to confirm the structure and detail of the data. This includes plotting atlas dimensions and displaying labeled regions to verify data accuracy. These meticulous preparations ensure the data is aptly formatted and enhanced, ready for subsequent in-depth statistical analysis and visualization tasks.
  4. Demographic Descriptives and Statistical Analysis of Cortical Thickness: Perform statistical analyses to understand the demographic distribution and cortical thickness variations across different biomarker groups. Calculate the counts for biomarkers, sex, and site categories and provides descriptive statistics for mean cortical thickness and age by biomarker group. Visualizations such as violin plots and bar charts are generated to display the distribution of age and sex across biomarker groups. Modeling cortical thickness variations, fitting linear models to predict mean cortical thickness based solely on biomarker groups and then incorporating additional covariates like age and sex. Detail the model outcomes with confidence intervals, performs Tukey HSD tests to scrutinize mean differences across groups, and visualizes these differences and their statistical significance. Visualize adjusted models’ predictions to provide a comprehensive understanding of cortical thickness influences.
  5. Regional Analyses: Across-Groups & Pairwise Comparisons: Apply the suffix ‘_rois’ to all ROI FreeSurfer measures in your dataframe, perform ANOVA analyses to extract p-values for each ROI across biomarker groups, and run FDR corrections. Further, derive F-stat values, merge and organize results for visualization, and display a heatmap of the significant F-stat values, annotated with significance levels. Remove the ‘_rois’ suffix from your ROI list and ensure that ROI names in your dataset match the Destrieux atlas nomenclature by extensively modifying and standardizing ROI names for both left and right hemispheres, then verifying and assigning corrected names in your dataframe. Visualize FDR-corrected p-values using ggseg for both left and right hemispheres, employing color gradients to represent significance levels, and save the combined plots as a PDF. Prepare data subsets for pairwise t-tests between different biomarker groups and perform these tests to extract statistical measures such as t-values, p-values, and FDR corrections, then visualize significant results using a lollipop plot to highlight the differences in brain regions across comparisons. Transform ROI names in your dataset to match the Destrieux atlas nomenclature by applying a function that renames based on hemisphere specifications and then ensures consistency with atlas data, using pattern replacements and validation against the atlas’s region list. Visualize and compare the significant FDR-corrected p-values across different biomarker groups (Control vs. Alzheimer’s, Control vs. MCI, and MCI vs. Alzheimer’s) for both hemispheres using ggseg maps, adjust the color gradients to highlight significance levels, and then compile all plots into a PDF for detailed analysis.
  6. Session Information: To enhance reproducibility, details about the working environment used for these analyses can be found below.

Data Frame Initialization

Code
# Set CRAN repository for consistent package installation, ggseg for neuroimaging visualizations
options(repos = c(
  ggseg = 'https://ggseg.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))

# Install and load the pacman package for efficient package management
if (!require(pacman)) install.packages("pacman")
library(pacman)

# Use p_load function to install (if necessary) and load packages
p_load(dplyr, knitr, kableExtra, psych, tidyr, readr, stringr, matrixStats, ggseg, ggseg3d, 
       ggsegExtra, ggsegDesterieux, cowplot, data.table, e1071, ggplot2, plot.matrix, proxy, 
       RPMG, broom, gridExtra, patchwork, caret, tidyverse, fastDummies, sjPlot, ggbeeswarm, 
       lavaan, caTools, bitops, effects, ggeffects, reshape2, ggpubr)

# Specify the 'base_path' where you can find your data files, ASSUMING they're in the same directory, & set it as WD
base_path <- "~/Downloads/ADNI_NormativeModeling"
setwd(base_path)

# Read in foundational datasets
NIDP_DX_Dem_NP <- read_csv("NIDP_DX_Dem_NP_MRI.csv") # Data set with clinical/non-imaging variables and MRI manufacturer information per subject assuming separate from FreeSurfer dataset
ADNI_freesurfer <- read_csv("ADNI_lh_rh_thickness_subcort_volumes.csv") # Data set with FreeSurfer brain thickness values

# Merging datasets on 'Subject_ID' with only ID values present in NIDP_DX_Dem_NP
ADNI_274_Merged <- merge(NIDP_DX_Dem_NP, ADNI_freesurfer, by = "Subject_ID", all.x = TRUE)

# Save the newly merged dataset as a CSV file
write_csv(ADNI_274_Merged, "ADNI_274_Merged.csv")

### Manual edits were done to this spreadsheet to only leave FreeSurfer values and balancing variables for data split
### FreeSurfer variables were edited to match PCN template, see GUI for details
### Covariates: 'age', 'sex', 'site'
# Rename finalized, merged file to 'ADNI_274_Merged_Final.csv'
ADNI_274_Merged_Final <- read_csv("ADNI_274_Merged_Final.csv") # This should be the spreadsheet that will be split into adaptation and testing sets for normative modeling

Covariate Distributions, Data Splitting, & Balance Verification

Code
# Create a bar plot for MRI manufacturers with counts displayed
ggplot(NIDP_DX_Dem_NP, aes(x = MRI_MANU)) +
  geom_bar(stat = "count", fill = "skyblue", color = "black") +
  geom_text(stat = "count", aes(label = ..count..), vjust = -0.5) +
  labs(title = "Count of MRI Manufacturers", x = "MRI Manufacturer", y = "Count") +
  theme_minimal()

Code
# Create separate bar plots for each manufacturer
ggplot(NIDP_DX_Dem_NP, aes(x = MRI_MAKE, fill = MRI_MANU)) +
  geom_bar(position = "dodge") +
  geom_text(aes(label = ..count..), stat = "count", position = position_dodge(width = 0.9), vjust = -0.5) +
  facet_wrap(~MRI_MANU, scales = "free_x") +
  labs(title = "Counts of Each MRI Make Grouped by Manufacturer", x = "MRI Make", y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels for better visibility

Code
# Counting subjects by BioMarkers within each site
bio_markers_site_counts <- ADNI_274_Merged_Final %>%
  group_by(site, BioMarkers) %>%
  dplyr::summarise(Count = n(), .groups = 'drop')

# Plot with Site on x-axis
plot_site_x <- ggplot(bio_markers_site_counts, aes(x = site, y = Count, fill = BioMarkers)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_text(aes(label = Count), vjust = -0.5, position = position_dodge(width = 0.9)) +
  labs(title = "Subject Counts by BioMarkers Across Sites",
       x = "Site",
       y = "Number of Subjects",
       fill = "BioMarkers") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0, hjust = 1))

# Plot with BioMarkers on x-axis
plot_biomarkers_x <- ggplot(bio_markers_site_counts, aes(x = BioMarkers, y = Count, fill = factor(site))) +
  geom_bar(stat = "identity", position = position_dodge()) +  # Use position_dodge() for proper grouping
  geom_text(aes(label = Count), vjust = -0.5, position = position_dodge(width = 0.9)) +  # Adjust text positioning
  labs(title = "Subject Counts Across BioMarkers by Site",
       x = "BioMarkers",
       y = "Number of Subjects",
       fill = "Site") +
  scale_fill_brewer(palette = "Set1") +  # Using a qualitative palette for distinct sites
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0, hjust = 1))  # Improve readability of x-axis labels


# Optionally, print both plots to view them in the R environment
print(plot_site_x)

Code
print(plot_biomarkers_x)

Code
# Separate the data based on 'BioMarkers'
adaptation_data <- ADNI_274_Merged_Final %>%
  filter(BioMarkers %in% c("A-C-", "A+C-"))

testing_data <- ADNI_274_Merged_Final %>%
  filter(BioMarkers %in% c("A-C+", "A+C+"))

# Save the adaptation and testing datasets as CSV files
write_csv(adaptation_data, "ADNI_175_Adaptation.csv")
write_csv(testing_data, "ADNI_99_Testing.csv")

# After verifying that the grouping variable and covariates are balanced as desired (i.e., see the next chunk), you can now run them through the normative model computation via the GUI or recreate their code yourself!
Code
# Plotting the distribution of 'BioMarkers' in both datasets
biomarker_distribution <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset, BioMarkers) %>%
  dplyr::summarise(Count = n(), .groups = 'drop') %>%
  group_by(dataset) %>%
  mutate(Percentage = (Count / sum(Count)) * 100)

ggplot(biomarker_distribution, aes(x = BioMarkers, y = Percentage, fill = dataset)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Distribution of Biomarker", x = "Biomarker", y = "Percentage (%)") +
  theme_minimal()

Code
# Plotting the distribution of 'site' in both datasets
site_distribution <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset, site) %>%
  dplyr::summarise(Count = n(), .groups = 'drop') %>%
  group_by(dataset) %>%
  mutate(Percentage = (Count / sum(Count)) * 100)

ggplot(site_distribution, aes(x = site, y = Percentage, fill = dataset)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Distribution of Site", x = "Site", y = "Percentage (%)") +
  theme_minimal()

Code
# Plotting the distribution of 'sex' in both datasets
sex_distribution <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset, sex) %>%
  dplyr::summarise(Count = n(), .groups = 'drop') %>%
  group_by(dataset) %>%
  mutate(Percentage = (Count / sum(Count)) * 100)

ggplot(sex_distribution, aes(x = sex, y = Percentage, fill = dataset)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Distribution of Sex", x = "Sex", y = "Percentage (%)") +
  theme_minimal()

Code
# Calculating and comparing the average age
age_summary <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset) %>%
  dplyr::summarise(Average_Age = mean(age, na.rm = TRUE), .groups = 'drop')

print(age_summary)
# A tibble: 2 × 2
  dataset    Average_Age
  <chr>            <dbl>
1 Adaptation        70.1
2 Testing           72.1

Analytic Data Preparation

Code
# Read in Destrieux atlas data
data("desterieux", package = "ggseg")  # Note: the ggseg package adds an extra 'e' in the spelling of the name Destrieux, so it's NOT a typo

# Assign atlas data to data frames in local environment
desterieux_dims <- desterieux$data
desterieux <- desterieux

# Plot simple atlas features to test data frame with dimension data
plot(desterieux_dims) +
  theme(legend.position = "bottom",
        legend.text = element_text(size = 7)) +
  guides(fill = guide_legend(ncol = 3))

NULL
Code
# Plot atlas ROIs with labels to test data frame with complete vectors
ggplot() +
  geom_brain(atlas = desterieux)

Code
# Rename data frame for efficient code modeling, using dataset with volumetric FreeSurfer measures for ALL subjects
df_merged <- read_csv("ADNI_274_Merged_Final.csv") # to be used ONLY for group cortical thickness comparisons

# Rename biomarker variable
df_merged <- dplyr::rename(df_merged, biomarker = "BioMarkers")

# Specifying the order of the first few columns for easier viewing
desired_order <- c("Subject_ID", "age", "sex", "site", "biomarker")

# Append the remaining column names that are not specified in desired_order
remaining_cols <- setdiff(names(df_merged), desired_order)

# Combine the specified order with the remaining columns
new_order <- c(desired_order, remaining_cols)

# Reorder the columns in df according to new_order
df_merged <- df_merged[, new_order]

# Recode biomarker variable
df_merged$biomarker <- recode(df_merged$biomarker,
                              "A-C-" = "HC",
                              "A+C-" = "HC",
                              "A-C+" = "MCI",
                              "A+C+" = "AD")
df_merged$biomarker <- as.factor(df_merged$biomarker)
df_merged$biomarker <- factor(df_merged$biomarker, levels = c("HC", "MCI", "AD")) # explicitly set the reference level

# Recode site variable
df_merged$site <- as.factor(df_merged$site)
df_merged$site <- gsub("1", "GE", df_merged$site)
df_merged$site <- gsub("2", "Philips", df_merged$site)
df_merged$site <- gsub("3", "Siemens", df_merged$site)

# Recode sex variable, 0 = females 1 = males
df_merged$sex <- factor(df_merged$sex)
df_merged$sex <- gsub("0", "Female", df_merged$sex)
df_merged$sex <- gsub("1", "Male", df_merged$sex)

Demographic Descriptives and Statistical Analysis of Cortical Thickness

Code
# Count of each variable of interest
table(df_merged$biomarker)

 HC MCI  AD 
175  40  59 
Code
table(df_merged$sex)

Female   Male 
   153    121 
Code
table(df_merged$site)

     GE Philips Siemens 
     49      48     177 
Code
# Cross-Tabulation of average thickness and biomarker group
describeBy(df_merged$Mean_Thickness, df_merged$biomarker)

 Descriptive statistics by group 
group: HC
   vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis   se
X1    1 175 2.43 0.08   2.43    2.43 0.07 2.17 2.65  0.48 -0.15     0.12 0.01
------------------------------------------------------------ 
group: MCI
   vars  n mean   sd median trimmed  mad  min  max range skew kurtosis   se
X1    1 40  2.4 0.08   2.41     2.4 0.07 2.24 2.59  0.35 -0.1    -0.28 0.01
------------------------------------------------------------ 
group: AD
   vars  n mean   sd median trimmed mad  min  max range  skew kurtosis   se
X1    1 59 2.35 0.09   2.34    2.35 0.1 2.16 2.55  0.39 -0.08    -0.83 0.01
Code
d<-(describeBy(df_merged$Mean_Thickness, df_merged$biomarker))

# Cross-Tabulation of age and biomarker group
describeBy(df_merged$age, df_merged$biomarker)

 Descriptive statistics by group 
group: HC
   vars   n mean   sd median trimmed  mad min max range skew kurtosis   se
X1    1 175 70.1 5.94     69   69.81 5.93  55  90    35 0.53     0.66 0.45
------------------------------------------------------------ 
group: MCI
   vars  n  mean   sd median trimmed mad min max range  skew kurtosis   se
X1    1 40 71.97 8.81     72   72.16 8.9  56  88    32 -0.22    -0.87 1.39
------------------------------------------------------------ 
group: AD
   vars  n  mean   sd median trimmed  mad min max range  skew kurtosis   se
X1    1 59 72.25 7.73     73   72.43 7.41  55  89    34 -0.19    -0.33 1.01
Code
mean(df_merged$age)
[1] 70.83942
Code
sd(df_merged$age)
[1] 6.871648
Code
# Generate the violin plot of age stratified by phenotype categories
ggplot(df_merged, aes(x = biomarker, y = age, fill = biomarker)) +
  geom_violin() +
  labs(title = "Distribution of Age Stratified by Phenotype Groups",
       x = "Phenotype Group", y = "Age") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0, hjust = 1))  # Improve readability of x-axis labels

Code
# Cross-Tabulation of sex & biomarker, visualization
ggplot(df_merged, aes(x = biomarker, fill = sex)) +
  geom_bar(position = "dodge") +
  labs(title = "Distribution of Sex Across Phenotype Groups",
       x = "Phenotype Group",
       y = "Count",
       fill = "Sex") +
  theme_minimal()

Code
# Create the summary Mean_Thickness data frame
filtered_MT_summary <- df_merged %>%
  group_by(biomarker) %>%
  dplyr::summarise(
    score_mean = mean(Mean_Thickness, na.rm = TRUE),
    n = n(), # Sample size for each group
    sem = sd(Mean_Thickness, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the rain cloud plot with separate components
ggplot(df_merged, aes(x = biomarker, y = Mean_Thickness, fill = biomarker, color = biomarker)) +
  PupillometryR::geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5, position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(biomarker)-0.25, y = Mean_Thickness, colour = biomarker), position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25, position = position_dodge(width = 0.3), colour = "black") +
  geom_point(data = filtered_MT_summary, aes(x = factor(biomarker), y = score_mean), shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_MT_summary, aes(x = factor(biomarker), y = score_mean, ymin = score_mean - sem, ymax = score_mean + sem), width = 0.05, position = position_nudge(x = 0.2)) +
  labs(
    title = "Distribution of Mean Cortical Thickness Stratified by Phenotype Groups", 
    y = "Score", 
    x = "Biomarker", 
    fill = "Phenotype Group",  # Legend title for fill
    color = "Phenotype Group"  # Legend title for color
  ) +
  theme_minimal() +
  theme(
    legend.position = "right",
    legend.title = element_text(face = "bold")  # Make legend title bold
  )

Code
# Fit a linear model of Mean_Thickness as a function of biomarker and store it in s_2
s_2 <- lm(Mean_Thickness ~ biomarker, data = df_merged)

# Create a table for the linear model with confidence intervals
tab_model(s_2, title = "Linear Model: Average Cortical Thickness as a Function of Biomarker")
Linear Model: Average Cortical Thickness as a Function of Biomarker
  Mean Thickness
Predictors Estimates CI p
(Intercept) 2.43 2.42 – 2.44 <0.001
biomarker [MCI] -0.03 -0.06 – 0.00 0.079
biomarker [AD] -0.08 -0.10 – -0.05 <0.001
Observations 274
R2 / R2 adjusted 0.126 / 0.119
Code
# Perform Tukey's Honest Significant Difference test and store results
tukey_results_s2 <- TukeyHSD(aov(Mean_Thickness ~ biomarker, data = df_merged))
print(tukey_results_s2)

Tukey multiple comparisons of means 95% family-wise confidence level

Fit: aov(formula = Mean_Thickness ~ biomarker, data = df_merged)

$biomarker diff lwr upr p adj MCI-HC -0.02631591 -0.06143749 0.008805661 0.1831271 AD-HC -0.07969303 -0.10986243 -0.049523633 0.0000000 AD-MCI -0.05337712 -0.09442260 -0.012331635 0.0067581

Code
tukey_data_s2 <- as.data.frame(tukey_results_s2$biomarker)
tukey_data_s2$Comparison <- rownames(tukey_data_s2)
tukey_data_s2$significant <- ifelse(tukey_data_s2$`p adj` < 0.05, "Significant", "Not Significant")

# Creating the plot of the Tukey HSD test with significance indication
ggplot(tukey_data_s2, aes(y = Comparison, xmin = lwr, xmax = upr, x = diff)) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
  geom_errorbarh(aes(height = 0.2, color = significant)) +
  geom_point(aes(x = diff, color = significant), size = 2) +
  scale_color_manual(values = c("Significant" = "red", "Not Significant" = "blue")) +
  labs(title = "Tukey HSD Test Results for Mean Cortical Thickness by Biomarker",
       x = "Differences in Mean Levels of Biomarker",
       y = "Comparison",
       color = "P-value Significance") +
  theme_minimal()

Code
# Fit a linear model of Mean_Thickness as a function of biomarker with added covariates
s_1<-lm(Mean_Thickness ~ biomarker + age + sex, data = df_merged)

# Create a table for the linear model including covariates with confidence intervals
tab_model(s_1, title = "Enhanced Linear Model: Average Cortical Thickness as a Function of Biomarker, Adjusted for Age and Sex")
Enhanced Linear Model: Average Cortical Thickness as a Function of Biomarker, Adjusted for Age and Sex
  Mean Thickness
Predictors Estimates CI p
(Intercept) 2.58 2.48 – 2.68 <0.001
biomarker [MCI] -0.02 -0.05 – 0.01 0.221
biomarker [AD] -0.07 -0.10 – -0.05 <0.001
age -0.00 -0.00 – -0.00 0.007
sex [Male] -0.02 -0.04 – 0.00 0.119
Observations 274
R2 / R2 adjusted 0.159 / 0.147
Code
# Plot multiple regression model
ggpredict(s_1, c(terms = "age", "biomarker", "sex")) |>
  plot() +
  labs(title = "Average Cortical Thickness as a Function of Biomarker, Adjusted for Age and Sex",
       x = "Age",
       y = "Mean Cortical Thickness (mm)",
       caption = "Note: 'biomarker' factors, 'HC' = Healthy Control; 'MCI' = Mild Cognitive Impairment; & 'AD' = Alzheimer's Disease.")

Regional Analyses: Across-Groups & Pairwise Comparisons

Code
# Apply suffix to all ROI FreeSurfer measures
df.rois <- df_merged %>% rename_at(vars((6:153)), ~ paste0(., '_rois'))

# Run ANOVA analyses for each ROI across biomarker group and extract just the p-values
df.stats <- as.data.frame(sapply(X = df.rois[,grep("_rois", names(df.rois),value = T)], FUN = function(x) summary(aov(x ~ df.rois$biomarker))[[1]][["Pr(>F)"]][1]))

# Rename columns in ANOVA output data frame
names(df.stats) <- "p_value"
setDT(df.stats, keep.rownames = "ROI")

# Verify that changes were made via the first 20 ROIs
head(df.stats, 20)
                             ROI      p_value
                          <char>        <num>
 1:      L_G&S_frontomargin_rois 4.069293e-01
 2:     L_G&S_occipital_inf_rois 1.707275e-03
 3:       L_G&S_paracentral_rois 5.474064e-01
 4:        L_G&S_subcentral_rois 2.408362e-02
 5:  L_G&S_transv_frontopol_rois 8.369497e-01
 6:        L_G&S_cingul-Ant_rois 7.697393e-01
 7:    L_G&S_cingul-Mid-Ant_rois 1.248427e-01
 8:   L_G&S_cingul-Mid-Post_rois 4.273325e-02
 9:  L_G_cingul-Post-dorsal_rois 2.454827e-05
10: L_G_cingul-Post-ventral_rois 1.128955e-01
11:              L_G_cuneus_rois 5.383524e-01
12: L_G_front_inf-Opercular_rois 8.558583e-03
13:   L_G_front_inf-Orbital_rois 1.877900e-02
14:  L_G_front_inf-Triangul_rois 1.318903e-01
15:        L_G_front_middle_rois 1.673422e-03
16:           L_G_front_sup_rois 2.681659e-02
17:   L_G_Ins_lg&S_cent_ins_rois 2.452448e-02
18:       L_G_insular_short_rois 5.823087e-02
19:    L_G_occipital_middle_rois 6.015292e-06
20:       L_G_occipital_sup_rois 1.052752e-02
                             ROI      p_value
Code
# Run False Discovery Rate (FDR) corrections 
df.stats <- cbind(df.stats, p.adjust(df.stats$p_value), method = "fdr")

# Rename column of FDR-corrected p-values
names(df.stats)[3] <- "FDR.pvalue"

# Verify that changes were made
head(df.stats, 20)
                             ROI      p_value   FDR.pvalue method
                          <char>        <num>        <num> <char>
 1:      L_G&S_frontomargin_rois 4.069293e-01 1.0000000000    fdr
 2:     L_G&S_occipital_inf_rois 1.707275e-03 0.1519474631    fdr
 3:       L_G&S_paracentral_rois 5.474064e-01 1.0000000000    fdr
 4:        L_G&S_subcentral_rois 2.408362e-02 1.0000000000    fdr
 5:  L_G&S_transv_frontopol_rois 8.369497e-01 1.0000000000    fdr
 6:        L_G&S_cingul-Ant_rois 7.697393e-01 1.0000000000    fdr
 7:    L_G&S_cingul-Mid-Ant_rois 1.248427e-01 1.0000000000    fdr
 8:   L_G&S_cingul-Mid-Post_rois 4.273325e-02 1.0000000000    fdr
 9:  L_G_cingul-Post-dorsal_rois 2.454827e-05 0.0028475988    fdr
10: L_G_cingul-Post-ventral_rois 1.128955e-01 1.0000000000    fdr
11:              L_G_cuneus_rois 5.383524e-01 1.0000000000    fdr
12: L_G_front_inf-Opercular_rois 8.558583e-03 0.6247765493    fdr
13:   L_G_front_inf-Orbital_rois 1.877900e-02 1.0000000000    fdr
14:  L_G_front_inf-Triangul_rois 1.318903e-01 1.0000000000    fdr
15:        L_G_front_middle_rois 1.673422e-03 0.1506079974    fdr
16:           L_G_front_sup_rois 2.681659e-02 1.0000000000    fdr
17:   L_G_Ins_lg&S_cent_ins_rois 2.452448e-02 1.0000000000    fdr
18:       L_G_insular_short_rois 5.823087e-02 1.0000000000    fdr
19:    L_G_occipital_middle_rois 6.015292e-06 0.0007338656    fdr
20:       L_G_occipital_sup_rois 1.052752e-02 0.7369265130    fdr
                             ROI      p_value   FDR.pvalue method
Code
### Repeat steps to now obtain the F-stat values
# Run ANOVA analyses for each ROI across biomarker group and extract just the F-stat values
df.f_stats <- as.data.frame(sapply(X = df.rois[,grep("_rois", names(df.rois),value = T)], FUN = function(x) summary(aov(x ~ df.rois$biomarker))[[1]][["F value"]][1]))

# Rename columns in ANOVA output data frame
names(df.f_stats) <- "f_stat"
setDT(df.f_stats, keep.rownames = "ROI")

# Verify that changes were made via the first 20 ROIs
head(df.f_stats, 20)
                             ROI     f_stat
                          <char>      <num>
 1:      L_G&S_frontomargin_rois  0.9021055
 2:     L_G&S_occipital_inf_rois  6.5250988
 3:       L_G&S_paracentral_rois  0.6039056
 4:        L_G&S_subcentral_rois  3.7779316
 5:  L_G&S_transv_frontopol_rois  0.1781083
 6:        L_G&S_cingul-Ant_rois  0.2619563
 7:    L_G&S_cingul-Mid-Ant_rois  2.0967579
 8:   L_G&S_cingul-Mid-Post_rois  3.1897430
 9:  L_G_cingul-Post-dorsal_rois 11.0417190
10: L_G_cingul-Post-ventral_rois  2.1989441
11:              L_G_cuneus_rois  0.6206591
12: L_G_front_inf-Opercular_rois  4.8454451
13:   L_G_front_inf-Orbital_rois  4.0338957
14:  L_G_front_inf-Triangul_rois  2.0410037
15:        L_G_front_middle_rois  6.5460925
16:           L_G_front_sup_rois  3.6674894
17:   L_G_Ins_lg&S_cent_ins_rois  3.7592872
18:       L_G_insular_short_rois  2.8733819
19:    L_G_occipital_middle_rois 12.5705761
20:       L_G_occipital_sup_rois  4.6311462
                             ROI     f_stat
Code
# List the ROIs that are significant after FDR correction
df.stats[which(df.stats$FDR.pvalue <0.05),]
                             ROI      p_value   FDR.pvalue method
                          <char>        <num>        <num> <char>
 1:  L_G_cingul-Post-dorsal_rois 2.454827e-05 2.847599e-03    fdr
 2:    L_G_occipital_middle_rois 6.015292e-06 7.338656e-04    fdr
 3: L_G_oc-temp_lat-fusifor_rois 4.435059e-04 4.346357e-02    fdr
 4: L_G_oc-temp_med-Parahip_rois 7.044673e-14 1.035567e-11    fdr
 5:  L_G_pariet_inf-Angular_rois 1.808600e-09 2.550126e-07    fdr
 6: L_G_pariet_inf-Supramar_rois 6.562463e-07 8.531201e-05    fdr
 7:        L_G_parietal_sup_rois 6.957297e-07 8.974913e-05    fdr
 8:           L_G_precuneus_rois 2.798773e-05 3.218589e-03    fdr
 9:    L_G_temp_sup-Lateral_rois 3.102014e-08 4.280780e-06    fdr
10: L_G_temp_sup-Plan_polar_rois 2.020428e-11 2.909417e-09    fdr
11: L_G_temp_sup-Plan_tempo_rois 1.294087e-05 1.552905e-03    fdr
12:        L_G_temporal_inf_rois 1.751079e-06 2.206360e-04    fdr
13:     L_G_temporal_middle_rois 4.475073e-12 6.488856e-10    fdr
14:          L_Lat_Fis-post_rois 4.221147e-05 4.601051e-03    fdr
15:         L_Pole_temporal_rois 1.068375e-09 1.527776e-07    fdr
16: L_S_circular_insula_inf_rois 4.064392e-05 4.521522e-03    fdr
17:   L_S_collat_transv_ant_rois 4.739050e-05 5.118174e-03    fdr
18:        L_S_front_middle_rois 2.619400e-04 2.619400e-02    fdr
19:           L_S_front_sup_rois 1.760990e-04 1.813820e-02    fdr
20:  L_S_interm_prim-Jensen_rois 1.361353e-04 1.418534e-02    fdr
21: L_S_intrapariet&P_trans_rois 4.726364e-09 6.616910e-07    fdr
22:  L_S_oc_sup&transversal_rois 1.740234e-07 2.331914e-05    fdr
23:       L_S_occipital_ant_rois 1.874682e-07 2.493327e-05    fdr
24:         L_S_oc-temp_lat_rois 5.478431e-08 7.450666e-06    fdr
25: L_S_oc-temp_med&Lingual_rois 5.016967e-06 6.221040e-04    fdr
26:  L_S_orbital_med-olfact_rois 3.978362e-04 3.938578e-02    fdr
27:   L_S_parieto_occipital_rois 7.095764e-08 9.579282e-06    fdr
28:         L_S_postcentral_rois 4.053353e-05 4.521522e-03    fdr
29:         L_S_subparietal_rois 4.037073e-05 4.521522e-03    fdr
30:        L_S_temporal_inf_rois 2.095310e-08 2.912480e-06    fdr
31:        L_S_temporal_sup_rois 9.850650e-15 1.457896e-12    fdr
32:  R_G_cingul-Post-dorsal_rois 2.505947e-04 2.531007e-02    fdr
33:    R_G_occipital_middle_rois 6.687443e-05 7.155564e-03    fdr
34: R_G_oc-temp_med-Parahip_rois 4.455720e-08 6.104336e-06    fdr
35:  R_G_pariet_inf-Angular_rois 2.163472e-07 2.855784e-05    fdr
36:        R_G_parietal_sup_rois 1.659870e-05 1.958647e-03    fdr
37:           R_G_precuneus_rois 3.389987e-05 3.864585e-03    fdr
38: R_G_temp_sup-Plan_polar_rois 3.534015e-05 3.993437e-03    fdr
39:        R_G_temporal_inf_rois 1.127403e-04 1.195047e-02    fdr
40:     R_G_temporal_middle_rois 1.394761e-09 1.980561e-07    fdr
41:          R_Lat_Fis-post_rois 1.912784e-04 1.951040e-02    fdr
42:         R_Pole_temporal_rois 3.828107e-06 4.785134e-04    fdr
43: R_S_circular_insula_inf_rois 8.345169e-06 1.009765e-03    fdr
44:  R_S_interm_prim-Jensen_rois 5.029451e-06 6.221040e-04    fdr
45: R_S_intrapariet&P_trans_rois 1.245576e-06 1.581881e-04    fdr
46:  R_S_oc_sup&transversal_rois 1.350984e-04 1.418534e-02    fdr
47: R_S_oc-temp_med&Lingual_rois 1.474772e-05 1.754978e-03    fdr
48:   R_S_parieto_occipital_rois 2.161362e-05 2.528793e-03    fdr
49:         R_S_subparietal_rois 2.382738e-07 3.121387e-05    fdr
50:        R_S_temporal_inf_rois 7.332961e-07 9.386190e-05    fdr
51:        R_S_temporal_sup_rois 2.418071e-12 3.530383e-10    fdr
                             ROI      p_value   FDR.pvalue method
Code
# Assign significance levels
df.stats$Significance <- ifelse(df.stats$FDR.pvalue < 0.001, '***',
                               ifelse(df.stats$FDR.pvalue < 0.01, '**',
                               ifelse(df.stats$FDR.pvalue < 0.05, '*', '')))

# Merge data frames for visualization
df.stats_merged <- merge(df.stats, df.f_stats, by = "ROI")

# Melt the data for plotting
df.melted <- melt(df.stats_merged, id.vars = "ROI", variable.name = "Statistic", value.name = "Value")

# Ensure that 'Value' is numeric
df.melted$Value <- as.numeric(as.character(df.melted$Value))

# Ensure df.stats and df.melted are merged to filter and sort
df.melted <- df.melted %>%
  filter(Statistic == "f_stat") %>%
  left_join(df.stats %>% select(ROI, Significance, FDR.pvalue), by = "ROI") %>%
  filter(FDR.pvalue < 0.05) %>%
  arrange(match(Significance, c("", "*", "**", "***")))  # Order by significance levels

# Remove suffix from ROI list
df.melted$ROI <- gsub("_rois", "", df.melted$ROI)

# Define group positions and labels
group_positions <- data.frame(
  start = c(1, 15, 30),  # example start positions
  end = c(14, 29, 51),    # example end positions
  label = c("*", "**", "***")  # example labels for significance
)

# Create the heatmap
heatmap_plot <- ggplot(df.melted, aes(x = ROI, y = Statistic, fill = Value)) +
  geom_tile() +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(title = "Heatmap of ANOVA Statistics Across ROIs", x = "Region of Interest", y = "F-Statistic") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

# Add custom brackets using annotations
for (i in 1:nrow(group_positions)) {
  heatmap_plot <- heatmap_plot +
    annotate("segment", x = group_positions$start[i], xend = group_positions$end[i], y = Inf, yend = Inf, yjust = 1.5, color = "black", size = 0.5) +
    annotate("text", x = (group_positions$start[i] + group_positions$end[i]) / 2, y = Inf, label = group_positions$label[i], vjust = 2, hjust = 0.5)
}

print(heatmap_plot)

Code
ggsave("~/Downloads/ANOVA_heatmap.pdf")
Code
# Remove suffix from ROI list
df.stats$ROI <- as.data.frame(gsub("_rois", "", df.stats$ROI))

### Rename ROIs to assign nomenclature consistent with the 'desterieux' atlas package
# Left hemisphere ROIs
left.df.stats <- df.stats %>%
  filter(str_detect(ROI, "L_"))
x <- gsub("L_", "", left.df.stats$ROI)
y <- gsub("G&S_", "G and S ", x)
z <- gsub("G_", "G ", y)
z1 <- gsub("S_", "S ", z)
z2 <- gsub("_", " ", z1)
z3 <- gsub(" bin", "", z2)
z4 <- gsub("\\.", " ", z3)
z5 <- gsub("cingul ", "cingul-", z4)
z6 <- gsub("Mid ", "Mid-", z5)
z7 <- gsub("Post ", "Post-", z6)
z8 <- gsub("inf ", "inf-", z7)
z9 <- gsub("med ", "med-", z8)
z10 <- gsub("sup ", "sup-", z9)
z11 <- gsub("Fis ant ", "Fis-ant-", z10)
z12 <- gsub("precentral ", "precentral-", z11)
z13 <- gsub("Fis pos ", "Fis-pos-", z12)
z14 <- gsub("lg&S", "lg and S", z13)
z15 <- gsub("oc temp", "oc-temp", z14)
z16 <- gsub("sup and transversal", "sup-transversal", z15)
z17 <- gsub("orbital H Shaped", "orbital-H Shaped", z16)
z18 <- gsub("oc sup&transversal", "oc sup and transversal", z17)
z19 <- gsub("prim Jensen", "prim-Jensen", z18)
z20 <- gsub("S oc-temp med&Lingual", "S oc-temp med and Lingual", z19)
z21 <- gsub("lat fusifor", "lat-fusifor", z20)
z22 <- gsub("middle&Lunatus", "middle and Lunatus", z21)
z23 <- gsub("intrapariet&P trans", "intrapariet and P trans", z22)
renamed_ROIs <- gsub("Lat Fis post", "Lat Fis-post", z23)

desterieux_ROIs <- as.data.frame(desterieux_dims %>% filter(hemi == "left"))$region
compare_lists <- cbind(sort(renamed_ROIs), sort(unique(desterieux_ROIs)))
list_matches <- compare_lists[,1] %in% compare_lists[,2]
compare_lists[!list_matches,]
     [,1] [,2]
Code
## if no mismatches, than add to data.frame as region
left.df.stats$region <- renamed_ROIs

# Right hemisphere ROIs
right.df.stats <- df.stats %>%
  filter(str_detect(ROI, "R_"))
x <- gsub("R_", "", right.df.stats$ROI)
y <- gsub("G&S_", "G and S ", x)
z <- gsub("G_", "G ", y)
z1 <- gsub("S_", "S ", z)
z2 <- gsub("_", " ", z1)
z3 <- gsub(" bin", "", z2)
z4 <- gsub("\\.", " ", z3)
z5 <- gsub("cingul ", "cingul-", z4)
z6 <- gsub("Mid ", "Mid-", z5)
z7 <- gsub("Post ", "Post-", z6)
z8 <- gsub("inf ", "inf-", z7)
z9 <- gsub("med ", "med-", z8)
z10 <- gsub("sup ", "sup-", z9)
z11 <- gsub("Fis ant ", "Fis-ant-", z10)
z12 <- gsub("precentral ", "precentral-", z11)
z13 <- gsub("Fis pos ", "Fis-pos-", z12)
z14 <- gsub("lg&S", "lg and S", z13)
z15 <- gsub("oc temp", "oc-temp", z14)
z16 <- gsub("sup and transversal", "sup-transversal", z15)
z17 <- gsub("orbital H Shaped", "orbital-H Shaped", z16)
z18 <- gsub("oc sup&transversal", "oc sup and transversal", z17)
z19 <- gsub("prim Jensen", "prim-Jensen", z18)
z20 <- gsub("S oc-temp med&Lingual", "S oc-temp med and Lingual", z19)
z21 <- gsub("lat fusifor", "lat-fusifor", z20)
z22 <- gsub("middle&Lunatus", "middle and Lunatus", z21)
z23 <- gsub("intrapariet&P trans", "intrapariet and P trans", z22)
renamed_ROIs <- gsub("Lat Fis post", "Lat Fis-post", z23)

desterieux_ROIs <- as.data.frame(desterieux_dims %>% filter(hemi == "right"))$region
compare_lists <- cbind(sort(renamed_ROIs), sort(unique(desterieux_ROIs)))
list_matches <- compare_lists[,1] %in% compare_lists[,2]
compare_lists[!list_matches,]
     [,1] [,2]
Code
## if no mismatches, than add to data.frame as region
right.df.stats$region <- renamed_ROIs
Code
# Left hemisphere
left_pvalues <- ggseg(.data=left.df.stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_pvalues <- ggseg(.data=right.df.stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

# Add titles to individual plots
left_pvalues <- left_pvalues + labs(title = "Between-Group Comparisons")
right_pvalues <- right_pvalues + labs(title = "Between-Group Comparisons")

cowplot::plot_grid(left_pvalues,right_pvalues, nrow = 2, labels = "AUTO")

Code
ggsave("~/Downloads/corticalthicknesspvalue_maps.pdf")
Code
# Data frame preparation
df.CA <- df_merged[grep("HC|AD", df_merged$biomarker), ] # Subset just Controls and Alzheimer's ("CA")
df.CM <- df_merged[grep("HC|MCI", df_merged$biomarker), ] # Subset just Controls and MCI ("CM")
df.MA <- df_merged[grep("MCI|AD", df_merged$biomarker), ] # Subset just MCI and Alzheimer's ("MA")

### Control vs Alzheimer's t-test
# Perform t-test and extract t-stat value, p-value, and parameters
df.CA_stats <- as.data.frame(sapply(df.CA[6:153], function(x) t.test(x ~ df.CA$biomarker)$statistic))
df.CA_stats1 <- as.data.frame(sapply(df.CA[6:153], function(x) t.test(x ~ df.CA$biomarker)$p.value))
df.CA_stats2 <- as.data.frame(sapply(df.CA[6:153], function(x) t.test(x ~ df.CA$biomarker)$parameter))

# Merge data frames and remove the subsequent, intermediate data frames
df.CA_stats <- cbind(df.CA_stats, df.CA_stats1)
df.CA_stats <- cbind(df.CA_stats, df.CA_stats2)
rm(df.CA_stats1, df.CA_stats2)

# Clean up the t-test data frame
df.CA_stats$t_statistic <- df.CA_stats[2] # Rename column
names(df.CA_stats) <- c('statistic', 'p.value', 'parameter')
df.CA_stats[4] <- NULL

# Perform FDR correction on t-test data frame
df.CA_stats <- cbind(df.CA_stats, p.adjust(df.CA_stats$p.value), method = "fdr")
names(df.CA_stats)[4] <- "FDR.pvalue"
df.CA_stats[5] <- NULL

# Clean up the FDR-corrected data frame
df.CA_stats <- tibble::rownames_to_column(df.CA_stats, "ROI") # Create ROI column
df.CA_stats$ROI <- gsub("\\.t$", "", df.CA_stats$ROI)

# List the ROIs that are significant after FDR correction
df.CA_stats_sig <- df.CA_stats[which(df.CA_stats$FDR.pvalue <0.05),]

### Control vs MCI t-test
# Perform t-test and extract t-stat value, p-value, and parameters
df.CM_stats <- as.data.frame(sapply(df.CM[6:153], function(x) t.test(x ~ df.CM$biomarker)$statistic))
df.CM_stats1 <- as.data.frame(sapply(df.CM[6:153], function(x) t.test(x ~ df.CM$biomarker)$p.value))
df.CM_stats2 <- as.data.frame(sapply(df.CM[6:153], function(x) t.test(x ~ df.CM$biomarker)$parameter))

# Merge data frames and remove the subsequent, intermediate data frames
df.CM_stats <- cbind(df.CM_stats, df.CM_stats1)
df.CM_stats <- cbind(df.CM_stats, df.CM_stats2)
rm(df.CM_stats1, df.CM_stats2)

# Clean up the t-test data frame
df.CM_stats$t_statistic <- df.CM_stats[2] # Rename column
names(df.CM_stats) <- c('statistic', 'p.value', 'parameter')
df.CM_stats[4] <- NULL

# Perform FDR correction on t-test data frame
df.CM_stats <- cbind(df.CM_stats, p.adjust(df.CM_stats$p.value), method = "fdr")
names(df.CM_stats)[4] <- "FDR.pvalue"
df.CM_stats[5] <- NULL

# Clean up the FDR-corrected data frame
df.CM_stats <- tibble::rownames_to_column(df.CM_stats, "ROI") # Create ROI column
df.CM_stats$ROI <- gsub("\\.t$", "", df.CM_stats$ROI)

# List the ROIs that are significant after FDR correction
df.CM_stats_sig <- df.CM_stats[which(df.CM_stats$FDR.pvalue <0.05),]

### MCI vs Alzheimer's t-test
# Perform t-test and extract t-stat value, p-value, and parameters
df.MA_stats <- as.data.frame(sapply(df.MA[6:153], function(x) t.test(x ~ df.MA$biomarker)$statistic))
df.MA_stats1 <- as.data.frame(sapply(df.MA[6:153], function(x) t.test(x ~ df.MA$biomarker)$p.value))
df.MA_stats2 <- as.data.frame(sapply(df.MA[6:153], function(x) t.test(x ~ df.MA$biomarker)$parameter))

# Merge data frames and remove the subsequent, intermediate data frames
df.MA_stats <- cbind(df.MA_stats, df.MA_stats1)
df.MA_stats <- cbind(df.MA_stats, df.MA_stats2)
rm(df.MA_stats1, df.MA_stats2)

# Clean up the t-test data frame
df.MA_stats$t_statistic <- df.MA_stats[2] # Rename column
names(df.MA_stats) <- c('statistic', 'p.value', 'parameter')
df.MA_stats[4] <- NULL

# Perform FDR correction on t-test data frame
df.MA_stats <- cbind(df.MA_stats, p.adjust(df.MA_stats$p.value), method = "fdr")
names(df.MA_stats)[4] <- "FDR.pvalue"
df.MA_stats[5] <- NULL

# Clean up the FDR-corrected data frame
df.MA_stats <- tibble::rownames_to_column(df.MA_stats, "ROI") # Create ROI column
df.MA_stats$ROI <- gsub("\\.t$", "", df.MA_stats$ROI)

# List the ROIs that are significant after FDR correction
df.MA_stats_sig <- df.MA_stats[which(df.MA_stats$FDR.pvalue <0.05),]

# Combine the significant results for easy plotting
combined_stats_sig <- bind_rows(
  df.CA_stats_sig %>% mutate(Comparison = "HC vs AD"),
  df.MA_stats_sig %>% mutate(Comparison = "MCI vs AD")
)

# Assign significance levels
combined_stats_sig$Significance <- ifelse(combined_stats_sig$FDR.pvalue < 0.001, '***',
                               ifelse(combined_stats_sig$FDR.pvalue < 0.01, '**',
                               ifelse(combined_stats_sig$FDR.pvalue < 0.05, '*', '')))

# Create the lollipop plot
ggplot(combined_stats_sig, aes(x = reorder(ROI, statistic), y = statistic, color = Comparison)) +
  geom_segment(aes(yend = 0, xend = ROI), size = 1, linetype = "dashed") +
  geom_point(size = 3) +
  labs(
    title = "Significant Differences in Brain Regions Across Comparisons",
    x = "Region of Interest",
    y = "T-Statistic Value",
    color = "Comparison",
    caption = "Note: 'HC' = Healthy Control; 'MCI' = Mild Cognitive Impairment; & 'AD' = Alzheimer's Disease.") +
  theme_minimal() +
  theme(
    legend.position = "top",
    axis.text.x = element_text(angle = 90, hjust = 1, size = 7.5)  # Rotate x labels for better visibility
  )

Code
ggsave("~/Downloads/lollipop_plot.pdf")
Code
### Rename ROIs to assign nomenclature consistent with the 'desterieux' atlas package
# Automated loop version
rename_rois <- function(df, hemi, desterieux_dims) {
  # Filter based on hemisphere
  hemi_prefix <- ifelse(hemi == "left", "L_", "R_")
  df_filtered <- df %>%
    filter(str_detect(ROI, hemi_prefix))
  
  # Perform renaming steps
  x <- gsub(paste0(hemi_prefix), "", df_filtered$ROI)
  patterns <- c("G&S_", "G_", "S_", "_", " bin", "\\.", "cingul ", "Mid ", "Post ", 
                "inf ", "med ", "sup ", "Fis ant ", "precentral ", "Fis pos ", "lg&S", 
                "oc temp", "sup and transversal", "orbital H Shaped", "oc sup&transversal", 
                "prim Jensen", "S oc-temp med&Lingual", "lat fusifor", "middle&Lunatus", 
                "intrapariet&P trans", "Lat Fis post")
  replacements <- c("G and S ", "G ", "S ", " ", "", " ", "cingul-", "Mid-", "Post-", 
                    "inf-", "med-", "sup-", "Fis-ant-", "precentral-", "Fis-pos-", 
                    "lg and S", "oc-temp", "sup-transversal", "orbital-H Shaped", 
                    "oc sup and transversal", "prim-Jensen", "S oc-temp med and Lingual", 
                    "lat-fusifor", "middle and Lunatus", "intrapariet and P trans", 
                    "Lat Fis-post")
  for (i in seq_along(patterns)) {
    x <- gsub(patterns[i], replacements[i], x)
  }
  
  # Match with desterieux ROIs
  desterieux_ROIs <- as.data.frame(desterieux_dims %>% filter(hemi == hemi))$region
  compare_lists <- cbind(sort(x), sort(unique(desterieux_ROIs)))
  list_matches <- compare_lists[,1] %in% compare_lists[,2]
  
  if (any(!list_matches)) {
    warning("There are mismatches in ROI names for the ", hemi, " hemisphere.")
  }
  
  df_filtered$region <- x
  return(df_filtered)
}

# Apply the function to each dataset and hemisphere
left.df.CA_stats <- rename_rois(df.CA_stats, "left", desterieux_dims)
right.df.CA_stats <- rename_rois(df.CA_stats, "right", desterieux_dims)
left.df.CM_stats <- rename_rois(df.CM_stats, "left", desterieux_dims)
right.df.CM_stats <- rename_rois(df.CM_stats, "right", desterieux_dims)
left.df.MA_stats <- rename_rois(df.MA_stats, "left", desterieux_dims)
right.df.MA_stats <- rename_rois(df.MA_stats, "right", desterieux_dims)
Code
### Control vs Alzheimer's
# Left hemisphere
left_CA_pvalues <- ggseg(.data=left.df.CA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_CA_pvalues <- ggseg(.data=right.df.CA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

### Control vs MCI
# Left hemisphere
left_CM_pvalues <- ggseg(.data=left.df.CM_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_CM_pvalues <- ggseg(.data=right.df.CM_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

### MCI vs Alzheimer's
# Left hemisphere
left_MA_pvalues <- ggseg(.data=left.df.MA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_MA_pvalues <- ggseg(.data=right.df.MA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

# Add titles to individual plots
left_CA_pvalues <- left_CA_pvalues + labs(title = "Healthy Controls vs Alzheimer's Group")
right_CA_pvalues <- right_CA_pvalues + labs(title = "Healthy Controls vs Alzheimer's Group")

left_CM_pvalues <- left_CM_pvalues + labs(title = "Healthy Controls vs MCI Group")
right_CM_pvalues <- right_CM_pvalues + labs(title = "Healthy Controls vs MCI Group")

left_MA_pvalues <- left_MA_pvalues + labs(title = "MCI Group vs Alzheimer's Group")
right_MA_pvalues <- right_MA_pvalues + labs(title = "MCI Group vs Alzheimer's Group")

# Print plots to verify that they were correctly constructed
cowplot::plot_grid(left_CA_pvalues, right_CA_pvalues, nrow = 2)

Code
cowplot::plot_grid(left_CM_pvalues, right_CM_pvalues, nrow = 2)

Code
cowplot::plot_grid(left_MA_pvalues, right_MA_pvalues, nrow = 2)

Code
### After verifying
# Start the PDF device
pdf("Pairwise Cortical Thickness p-Value Maps.pdf", width = 11, height = 8.5)

# Lay out plots to be inserted and downloaded into a separate PDF file
cowplot::plot_grid(left_CA_pvalues, right_CA_pvalues, nrow = 2)
cowplot::plot_grid(left_CM_pvalues, right_CM_pvalues, nrow = 2)
cowplot::plot_grid(left_MA_pvalues, right_MA_pvalues, nrow = 2)

# Close the PDF device
dev.off()
quartz_off_screen 
                2 
Code
# Read in PDC data sets
PDC_HBR <- read_csv("~/Downloads/ADNI_NormativeModeling/PDC_HBR.csv")
Rows: 153 Columns: 155
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr   (2): SubjectID, phenotype
dbl (153): age, sex, site, L_G&S_frontomargin, L_G&S_occipital_inf, L_G&S_pa...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
PDC_BLR <- read_csv("~/Downloads/ADNI_NormativeModeling/PDC_BLR.csv")
Rows: 142 Columns: 155
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr   (2): SubjectID, phenotype
dbl (153): age, sex, site, R_Lat_Fis-ant-Horizont, R_G_temp_sup-G_T_transv, ...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
# Merge datasets on SubjectID
merged_data <- inner_join(PDC_HBR, PDC_BLR, by = "SubjectID", suffix = c(".HBR", ".BLR"))

# Specify the brain regions for which correlations will be computed for
brain_regions <- c("L_G&S_frontomargin", "L_G&S_occipital_inf", "L_G&S_paracentral", 
                   "L_G&S_subcentral", "L_G&S_transv_frontopol", "L_G&S_cingul-Ant",
                   "L_G&S_cingul-Mid-Ant", "L_G&S_cingul-Mid-Post", "L_G_cingul-Post-dorsal",
                   "L_G_cingul-Post-ventral", "L_G_cuneus", "L_G_front_inf-Opercular", 
                   "L_G_front_inf-Orbital", "L_G_front_inf-Triangul", "L_G_front_middle", 
                   "L_G_front_sup", "L_G_Ins_lg&S_cent_ins", "L_G_insular_short", 
                   "L_G_occipital_middle", "L_G_occipital_sup", "L_G_oc-temp_lat-fusifor",
                   "L_G_oc-temp_med-Lingual", "L_G_oc-temp_med-Parahip", "L_G_orbital", 
                   "L_G_pariet_inf-Angular", "L_G_pariet_inf-Supramar", "L_G_parietal_sup", 
                   "L_G_postcentral", "L_G_precentral", "L_G_precuneus", "L_G_rectus", 
                   "L_G_subcallosal", "L_G_temp_sup-G_T_transv", "L_G_temp_sup-Lateral",
                   "L_G_temp_sup-Plan_polar", "L_G_temp_sup-Plan_tempo", "L_G_temporal_inf",
                   "L_G_temporal_middle", "L_Lat_Fis-ant-Horizont", "L_Lat_Fis-ant-Vertical",
                   "L_Lat_Fis-post", "L_Pole_occipital", "L_Pole_temporal", "L_S_calcarine", 
                   "L_S_central", "L_S_cingul-Marginalis", "L_S_circular_insula_ant",
                   "L_S_circular_insula_inf", "L_S_circular_insula_sup", "L_S_collat_transv_ant", 
                   "L_S_collat_transv_post", "L_S_front_inf", "L_S_front_middle", "L_S_front_sup", 
                   "L_S_interm_prim-Jensen", "L_S_intrapariet&P_trans", "L_S_oc_middle&Lunatus", 
                   "L_S_oc_sup&transversal", "L_S_occipital_ant", "L_S_oc-temp_lat", 
                   "L_S_oc-temp_med&Lingual", "L_S_orbital_lateral", "L_S_orbital_med-olfact", 
                   "L_S_orbital-H_Shaped", "L_S_parieto_occipital", "L_S_pericallosal", 
                   "L_S_postcentral", "L_S_precentral-inf-part", "L_S_precentral-sup-part", 
                   "L_S_suborbital", "L_S_subparietal", "L_S_temporal_inf", "L_S_temporal_sup", 
                   "L_S_temporal_transverse", "R_G&S_frontomargin", "R_G&S_occipital_inf", 
                   "R_G&S_paracentral", "R_G&S_subcentral", "R_G&S_transv_frontopol", 
                   "R_G&S_cingul-Ant", "R_G&S_cingul-Mid-Ant", "R_G&S_cingul-Mid-Post", 
                   "R_G_cingul-Post-dorsal", "R_G_cingul-Post-ventral", "R_G_cuneus", 
                   "R_G_front_inf-Opercular", "R_G_front_inf-Orbital", "R_G_front_inf-Triangul", 
                   "R_G_front_middle", "R_G_front_sup", "R_G_Ins_lg&S_cent_ins", 
                   "R_G_insular_short", "R_G_occipital_middle", "R_G_occipital_sup", 
                   "R_G_oc-temp_lat-fusifor", "R_G_oc-temp_med-Lingual", "R_G_oc-temp_med-Parahip", 
                   "R_G_orbital", "R_G_pariet_inf-Angular", "R_G_pariet_inf-Supramar", 
                   "R_G_parietal_sup", "R_G_postcentral", "R_G_precentral", "R_G_precuneus", 
                   "R_G_rectus", "R_G_subcallosal", "R_G_temp_sup-G_T_transv", "R_G_temp_sup-Lateral",
                   "R_G_temp_sup-Plan_polar", "R_G_temp_sup-Plan_tempo", "R_G_temporal_inf", 
                   "R_G_temporal_middle", "R_Lat_Fis-ant-Horizont", "R_Lat_Fis-ant-Vertical", 
                   "R_Lat_Fis-post", "R_Pole_occipital", "R_Pole_temporal", "R_S_calcarine", 
                   "R_S_central", "R_S_cingul-Marginalis", "R_S_circular_insula_ant", 
                   "R_S_circular_insula_inf", "R_S_circular_insula_sup", "R_S_collat_transv_ant", 
                   "R_S_collat_transv_post", "R_S_front_inf", "R_S_front_middle", "R_S_front_sup", 
                   "R_S_interm_prim-Jensen", "R_S_intrapariet&P_trans", "R_S_oc_middle&Lunatus", 
                   "R_S_oc_sup&transversal", "R_S_occipital_ant", "R_S_oc-temp_lat", 
                   "R_S_oc-temp_med&Lingual", "R_S_orbital_lateral", "R_S_orbital_med-olfact", 
                   "R_S_orbital-H_Shaped", "R_S_parieto_occipital", "R_S_pericallosal", 
                   "R_S_postcentral", "R_S_precentral-inf-part", "R_S_precentral-sup-part", 
                   "R_S_suborbital", "R_S_subparietal", "R_S_temporal_inf", "R_S_temporal_sup", 
                   "R_S_temporal_transverse")

# Compute correlations for each brain region
correlation_results <- list()

for(region in brain_regions) {
    region_hbr <- paste(region, ".HBR", sep = "")
    region_blr <- paste(region, ".BLR", sep = "")
    
    # Ensure both variables are present in the dataset
    if(region_hbr %in% names(merged_data) && region_blr %in% names(merged_data)) {
        # Use cor.test to get both correlation and p-value
        test_result <- cor.test(merged_data[[region_hbr]], merged_data[[region_blr]], 
                                method = "pearson", use = "complete.obs")
        
        # Store results in a list
        correlation_results[[region]] <- list(
            Correlation = test_result$estimate,
            P_Value = test_result$p.value
        )
    }
}

# Convert the list to a data frame for easy viewing
correlation_df <- data.frame(
    Region = names(correlation_results),
    Correlation = sapply(correlation_results, function(x) x$Correlation),
    P_Value = sapply(correlation_results, function(x) x$P_Value),
    row.names = NULL
)

# Print the results
print(correlation_df)
                     Region Correlation       P_Value
1        L_G&S_frontomargin   0.9441809  6.240770e-53
2       L_G&S_occipital_inf   0.8387737  9.485030e-30
3         L_G&S_paracentral   0.8879535  1.529169e-37
4          L_G&S_subcentral   0.9466024  6.338163e-54
5    L_G&S_transv_frontopol   0.9732963  1.423180e-69
6          L_G&S_cingul-Ant   0.8900339  5.989882e-38
7      L_G&S_cingul-Mid-Ant   0.9657367  6.396050e-64
8     L_G&S_cingul-Mid-Post   0.9682624  1.179832e-65
9    L_G_cingul-Post-dorsal   0.9745212  1.219494e-70
10  L_G_cingul-Post-ventral   0.9874266  9.419893e-87
11               L_G_cuneus   0.9739739  3.708802e-70
12  L_G_front_inf-Opercular   0.9056285  2.748013e-41
13    L_G_front_inf-Orbital   0.9600228  1.956852e-60
14   L_G_front_inf-Triangul   0.9049802  3.882216e-41
15         L_G_front_middle   0.9454341  1.935779e-53
16            L_G_front_sup   0.8948487  6.355426e-39
17    L_G_Ins_lg&S_cent_ins   0.9517895  3.226544e-56
18        L_G_insular_short   0.9557224  3.931883e-58
19     L_G_occipital_middle   0.9371754  2.733551e-50
20        L_G_occipital_sup   0.9780833  4.568502e-74
21  L_G_oc-temp_lat-fusifor   0.8991228  7.903988e-40
22  L_G_oc-temp_med-Lingual   0.9618520  1.714612e-61
23  L_G_oc-temp_med-Parahip   0.9429966  1.840812e-52
24              L_G_orbital   0.9587014  1.059563e-59
25   L_G_pariet_inf-Angular   0.9719775  1.770169e-68
26  L_G_pariet_inf-Supramar   0.9898107  1.450111e-91
27         L_G_parietal_sup   0.9819521  1.708513e-78
28          L_G_postcentral   0.9858249  5.201263e-84
29           L_G_precentral   0.9785582  1.448158e-74
30            L_G_precuneus   0.9817961  2.684862e-78
31               L_G_rectus   0.9892063  3.027240e-90
32          L_G_subcallosal   0.9701152  5.106715e-67
33  L_G_temp_sup-G_T_transv   0.9832625  3.252148e-80
34     L_G_temp_sup-Lateral   0.9763569  2.430148e-72
35  L_G_temp_sup-Plan_polar   0.9785819  1.366686e-74
36  L_G_temp_sup-Plan_tempo   0.9960224 3.790300e-113
37         L_G_temporal_inf   0.9656161  7.680761e-64
38      L_G_temporal_middle   0.9739895  3.594727e-70
39   L_Lat_Fis-ant-Horizont   0.9787217  9.693283e-75
40   L_Lat_Fis-ant-Vertical   0.9918993  8.028713e-97
41           L_Lat_Fis-post   0.9934494 1.078707e-101
42         L_Pole_occipital   0.9369329  3.331233e-50
43          L_Pole_temporal   0.9155653  9.843325e-44
44            L_S_calcarine   0.9793646  1.938492e-75
45              L_S_central   0.9916187  4.844566e-96
46    L_S_cingul-Marginalis   0.9831002  5.400698e-80
47  L_S_circular_insula_ant   0.9769093  7.042333e-73
48  L_S_circular_insula_inf   0.9916912  3.062216e-96
49  L_S_circular_insula_sup   0.9319099  1.694422e-48
50    L_S_collat_transv_ant   0.9823310  5.602706e-79
51   L_S_collat_transv_post   0.9450398  2.806014e-53
52            L_S_front_inf   0.9660544  3.936343e-64
53         L_S_front_middle   0.9735392  8.824335e-70
54            L_S_front_sup   0.9646310  3.345863e-63
55   L_S_interm_prim-Jensen   0.9785678  1.414519e-74
56  L_S_intrapariet&P_trans   0.9779550  6.204550e-74
57    L_S_oc_middle&Lunatus   0.9222308  1.503869e-45
58   L_S_oc_sup&transversal   0.9621160  1.194726e-61
59        L_S_occipital_ant   0.9527094  1.190559e-56
60          L_S_oc-temp_lat   0.9814668  6.885094e-78
61  L_S_oc-temp_med&Lingual   0.9670135  8.833665e-65
62      L_S_orbital_lateral   0.8700354  2.444591e-34
63   L_S_orbital_med-olfact   0.9785790  1.376217e-74
64     L_S_orbital-H_Shaped   0.9707998  1.521968e-67
65    L_S_parieto_occipital   0.9844438  6.930263e-82
66         L_S_pericallosal   0.9477710  2.022773e-54
67          L_S_postcentral   0.9781059  4.327602e-74
68  L_S_precentral-inf-part   0.9695648  1.324280e-66
69  L_S_precentral-sup-part   0.8710740  1.643642e-34
70           L_S_suborbital   0.9908692  4.451610e-94
71          L_S_subparietal   0.9839788  3.261221e-81
72         L_S_temporal_inf   0.9635707  1.557520e-62
73         L_S_temporal_sup   0.9725570  5.937132e-69
74  L_S_temporal_transverse   0.9775971  1.443155e-73
75       R_G&S_frontomargin   0.8891843  8.802778e-38
76      R_G&S_occipital_inf   0.9308154  3.833131e-48
77        R_G&S_paracentral   0.8987629  9.454142e-40
78         R_G&S_subcentral   0.9741369  2.670095e-70
79   R_G&S_transv_frontopol   0.9413515  7.963813e-52
80         R_G&S_cingul-Ant   0.8802103  4.282435e-36
81     R_G&S_cingul-Mid-Ant   0.9818162  2.533406e-78
82    R_G&S_cingul-Mid-Post   0.9439760  7.537901e-53
83   R_G_cingul-Post-dorsal   0.9893147  1.777174e-90
84  R_G_cingul-Post-ventral   0.9723510  8.777566e-69
85               R_G_cuneus   0.9524624  1.559069e-56
86  R_G_front_inf-Opercular   0.9598653  2.400206e-60
87    R_G_front_inf-Orbital   0.9872547  1.926039e-86
88   R_G_front_inf-Triangul   0.9310460  3.230962e-48
89         R_G_front_middle   0.9484056  1.075851e-54
90            R_G_front_sup   0.8595594  1.117939e-32
91    R_G_Ins_lg&S_cent_ins   0.9262081  1.034250e-46
92        R_G_insular_short   0.9658190  5.642899e-64
93     R_G_occipital_middle   0.8790151  7.016927e-36
94        R_G_occipital_sup   0.9530118  8.541252e-57
95  R_G_oc-temp_lat-fusifor   0.9397683  3.134632e-51
96  R_G_oc-temp_med-Lingual   0.8320543  6.855260e-29
97  R_G_oc-temp_med-Parahip   0.9781895  3.541331e-74
98              R_G_orbital   0.9461587  9.713781e-54
99   R_G_pariet_inf-Angular   0.9900036  5.289795e-92
100 R_G_pariet_inf-Supramar   0.9810918  1.971430e-77
101        R_G_parietal_sup   0.9532546  6.531787e-57
102         R_G_postcentral   0.9659177  4.853196e-64
103          R_G_precentral   0.9829091  9.752930e-80
104           R_G_precuneus   0.9893391  1.575822e-90
105              R_G_rectus   0.9670953  7.760994e-65
106         R_G_subcallosal   0.9427736  2.250722e-52
107 R_G_temp_sup-G_T_transv   0.9765764  1.490887e-72
108    R_G_temp_sup-Lateral   0.9744812  1.323875e-70
109 R_G_temp_sup-Plan_polar   0.9364426  4.957588e-50
110 R_G_temp_sup-Plan_tempo   0.9940436 7.093169e-104
111        R_G_temporal_inf   0.9805210  9.397794e-77
112     R_G_temporal_middle   0.9794203  1.682196e-75
113  R_Lat_Fis-ant-Horizont   0.9809646  2.803181e-77
114  R_Lat_Fis-ant-Vertical   0.9382120  1.163111e-50
115          R_Lat_Fis-post   0.9902001  1.856328e-92
116        R_Pole_occipital   0.9867998  1.221581e-85
117         R_Pole_temporal   0.9030309  1.081088e-40
118           R_S_calcarine   0.9850460  8.681774e-83
119             R_S_central   0.9359165  7.568053e-50
120   R_S_cingul-Marginalis   0.9626403  5.786859e-62
121 R_S_circular_insula_ant   0.9595257  3.718551e-60
122 R_S_circular_insula_inf   0.9801970  2.233655e-76
123 R_S_circular_insula_sup   0.9347096  1.970814e-49
124   R_S_collat_transv_ant   0.9946453 2.548284e-106
125  R_S_collat_transv_post   0.9286010  1.920165e-47
126           R_S_front_inf   0.9554797  5.219772e-58
127        R_S_front_middle   0.8714062  1.446581e-34
128           R_S_front_sup   0.9174505  3.127269e-44
129  R_S_interm_prim-Jensen   0.9816800  3.749314e-78
130 R_S_intrapariet&P_trans   0.9580532  2.378127e-59
131   R_S_oc_middle&Lunatus   0.8476294  6.065221e-31
132  R_S_oc_sup&transversal   0.9472525  3.368279e-54
133       R_S_occipital_ant   0.9837397  7.107497e-81
134         R_S_oc-temp_lat   0.9563044  1.979707e-58
135 R_S_oc-temp_med&Lingual   0.9779272  6.627961e-74
136     R_S_orbital_lateral   0.8498671  2.944711e-31
137  R_S_orbital_med-olfact   0.9320631  1.509935e-48
138    R_S_orbital-H_Shaped   0.8323132  6.362609e-29
139   R_S_parieto_occipital   0.9933773 1.923856e-101
140        R_S_pericallosal   0.9759544  5.884867e-72
141         R_S_postcentral   0.9482349  1.275976e-54
142 R_S_precentral-inf-part   0.9651018  1.664798e-63
143 R_S_precentral-sup-part   0.9445751  4.331069e-53
144          R_S_suborbital   0.9651162  1.629301e-63
145         R_S_subparietal   0.9723543  8.723141e-69
146        R_S_temporal_inf   0.9489402  6.281677e-55
147        R_S_temporal_sup   0.9594341  4.181644e-60
148 R_S_temporal_transverse   0.9542736  2.085742e-57
Code
# Convert the data to a suitable format for ggplot
correlation_df$Region <- factor(correlation_df$Region, levels = correlation_df$Region)

# Create a scatter plot for each region showing the correlation values
# Assuming 'merged_data' contains corresponding columns for HBR and BLR scores
p <- ggplot(correlation_df, aes(x = Region, y = Correlation)) +
  geom_point(stat = "identity") +
  geom_smooth(method = "lm", color = "blue") +
  ylim(0.8, 1) +
  labs(title = "Correlation of Deviation Scores Across Brain Regions",
       x = "Brain Region", y = "Correlation Coefficient") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 5.5))

# Print the plot
print(p)
`geom_smooth()` using formula = 'y ~ x'

Code
# Optional: Save the plot as a PNG
ggsave("correlation_plot.png", width = 11, height = 8)
`geom_smooth()` using formula = 'y ~ x'
Code
# Adding a new column for significance level based on p-values
correlation_df$Significance <- ifelse(correlation_df$P_Value < 0.001, '***',
                                      ifelse(correlation_df$P_Value < 0.01, '**',
                                      ifelse(correlation_df$P_Value < 0.05, '*', 'ns')))

# Create the lollipop plot
ggplot(correlation_df, aes(x = reorder(Region, Correlation), y = Correlation, color = Significance)) +
    geom_segment(aes(xend = Region, yend = 0.8), size = 0.25) +  # Create lines from 0 to the correlation value
    geom_point(size = 1) +  # Add points for each correlation value
    scale_color_manual(values = c('***' = 'red', '**' = 'orange', '*' = 'blue', 'ns' = 'grey')) +  # Assign colors based on significance
    labs(
        title = "Correlation and Significance of Brain Regions: Hierarchical vs. Linear Bayesian Regression",
        x = "Brain Region",
        y = "Correlation Coefficient",
        color = "Significance Level"
    ) +
    theme_minimal() +
    theme(
        legend.position = "top",  # Move the legend to the top of the plot
        axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 5.5),  # Improve readability of x-axis labels
        legend.title = element_text(face = "bold")  # Bold the legend title for emphasis
    )

Code
# Optional: Save the plot as a PNG
ggsave("correlation_significance_plot.png", width = 11, height = 8)

Z-Score Statistics Pipeline

Code
# Read in computed deviation scores
ADNI_deviation_scores <- read_csv("ADNI_deviation_scores_BLR.csv") # This dataset contains raw subject FreeSurfer volumetric measures and their respective deviation scores for your TESTING set used in the Normative Modeling

# Use dummy_cols to create dummy variables for the 'BioMarkers' column
ADNI_deviation_scores <- dummy_cols(ADNI_deviation_scores, select_columns = "BioMarkers", remove_first_dummy = TRUE, remove_selected_columns = TRUE)

# Rename data frame for efficient code modeling, this is the data frame that will be used for all other analyses EXCEPT group cortical thickness comparisons
df <- ADNI_deviation_scores

# Omit the 'index' column from the data frame, unnecessary variable
df <- df[, -which(names(df) == "index")]

# Rename biomarker variable
df <- dplyr::rename(df, biomarker = `BioMarkers_A+C+`)

# Specifying the order of the first few columns for easier viewing
desired_order <- c("Subject_ID", "age", "sex", "site", "biomarker")

# Append the remaining column names that are not specified in desired_order
remaining_cols <- setdiff(names(df), desired_order)

# Combine the specified order with the remaining columns
new_order <- c(desired_order, remaining_cols)

# Reorder the columns in df according to new_order
df <- df[, new_order]

# List of columns to exclude from the ROI list
desired_order <- c("Subject_ID", "age", "sex", "site", "biomarker")

# Extract column names, remove those in desired_order, and remove z-score suffixes
ROI <- names(df)[!names(df) %in% desired_order & !grepl("_Z_predict", names(df))]

# Further clean the ROI names to ensure no duplicates from z-score names
ROI <- unique(gsub("_Z_predict", "", ROI))

# Recode biomarker variable
df$biomarker <- as.factor(df$biomarker)
df$biomarker <- gsub("0", "MCI", df$biomarker)
df$biomarker <- gsub("1", "AD", df$biomarker)
df$biomarker <- factor(df$biomarker, levels = c("MCI", "AD")) # explicitly set the reference level

# Recode site variable
df$site <- as.factor(df$site)
df$site <- gsub("1", "GE", df$site)
df$site <- gsub("2", "Philips", df$site)
df$site <- gsub("3", "Siemens", df$site)

# Recode sex variable, 0= females 1= males
df$sex <- factor(df$sex)
df$sex <- gsub("0", "Female", df$sex)
df$sex <- gsub("1", "Male", df$sex)
Code
# Establish outlier threshold
outlier_threshold <- -1.96 # bottom 2.5%, as used in the Verdi et al., 2023 paper

# Apply threshold to z-score data to create dummy columns
df3 <- as.data.frame(ifelse(df[,6:153] < outlier_threshold,1,0))

# Rename all binarized columns to have the suffix "_bin"
df3 <- df3 %>% rename_all(paste0, "_bin")

# Sum the dummy column values to create a total outlier score
df$total_outlier_score <- rowSums(df3)

# Merge the two data frames
df <- cbind(df, df3)

# Remove df3 from your R environment as it's no longer needed
rm(df3)

# Create the box plot with colored outlines and filled points
ggplot(df, aes(x = biomarker, y = total_outlier_score, color = biomarker, fill = biomarker)) +
  geom_boxplot(outlier.shape = NA, fill = NA, size = 1) +  # Draw box plots with no fill, only outlines
  geom_jitter(width = 0.2, size = 2, shape = 21) +  # Add jittered points with the same color
  labs(x = "Biomarker", y = "Total Outlier Score") +
  theme_minimal() +
  theme(legend.position = "right")

Code
# Calculate descriptives of total outlier score grouped by biomarker group
describeBy(df$total_outlier_score, df$biomarker, IQR = T)

 Descriptive statistics by group 
group: MCI
   vars  n mean   sd median trimmed  mad min max range skew kurtosis   se IQR
X1    1 40  5.3 6.79      2       4 2.97   0  28    28 1.59     1.78 1.07   7
------------------------------------------------------------ 
group: AD
   vars  n  mean    sd median trimmed  mad min max range skew kurtosis   se
X1    1 59 11.83 15.46      6     8.9 7.41   0  66    66 1.85     2.86 2.01
    IQR
X1 12.5

Session Information

Code
sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggpubr_0.6.0              reshape2_1.4.4           
 [3] ggeffects_1.7.0           effects_4.2-2            
 [5] carData_3.0-5             bitops_1.0-7             
 [7] caTools_1.18.2            lavaan_0.6-17            
 [9] ggbeeswarm_0.7.2          sjPlot_2.8.16            
[11] fastDummies_1.7.3         lubridate_1.9.3          
[13] forcats_1.0.0             purrr_1.0.2              
[15] tibble_3.2.1              tidyverse_2.0.0          
[17] caret_6.0-94              lattice_0.22-6           
[19] patchwork_1.2.0           gridExtra_2.3            
[21] broom_1.0.6               RPMG_2.2-7               
[23] proxy_0.4-27              plot.matrix_1.6.2        
[25] ggplot2_3.5.1             e1071_1.7-14             
[27] data.table_1.15.4         cowplot_1.1.3            
[29] ggsegDesterieux_1.0.1.002 ggsegExtra_1.5.33.004    
[31] ggseg3d_1.6.3             ggseg_1.6.6              
[33] matrixStats_1.3.0         stringr_1.5.1            
[35] readr_2.1.5               tidyr_1.3.1              
[37] psych_2.4.6.26            kableExtra_1.4.0         
[39] knitr_1.48                dplyr_1.1.4              
[41] pacman_0.5.1             

loaded via a namespace (and not attached):
  [1] splines_4.4.0        R.oo_1.26.0          datawizard_0.11.0   
  [4] hardhat_1.3.1        pROC_1.18.5          rpart_4.1.23        
  [7] lifecycle_1.0.4      rstatix_0.7.2        sf_1.0-16           
 [10] pbmcapply_1.5.1      vroom_1.6.5          globals_0.16.3      
 [13] MASS_7.3-60.2        insight_0.20.1       survey_4.4-2        
 [16] backports_1.5.0      magrittr_2.0.3       plotly_4.10.4       
 [19] rmarkdown_2.27       yaml_2.3.9           rgdal_1.6-7         
 [22] sp_2.1-4             RColorBrewer_1.1-3   minqa_1.2.7         
 [25] DBI_1.2.3            abind_1.4-5          quadprog_1.5-8      
 [28] R.utils_2.12.3       rgl_1.3.1            nnet_7.3-19         
 [31] ipred_0.9-14         lava_1.8.0           listenv_0.9.1       
 [34] terra_1.7-78         units_0.8-5          performance_0.12.0  
 [37] parallelly_1.37.1    svglite_2.1.3        codetools_0.2-20    
 [40] xml2_1.3.6           neurobase_1.32.4     tidyselect_1.2.1    
 [43] raster_3.6-26        farver_2.1.2         effectsize_0.8.9    
 [46] lme4_1.1-35.5        stats4_4.4.0         base64enc_0.1-3     
 [49] PupillometryR_0.0.5  jsonlite_1.8.8       survival_3.5-8      
 [52] iterators_1.0.14     systemfonts_1.1.0    RNifti_1.7.0        
 [55] smoothr_1.0.1        foreach_1.5.2        tools_4.4.0         
 [58] ragg_1.3.2           Rcpp_1.0.12          glue_1.7.0          
 [61] mnormt_2.1.1         prodlim_2023.08.28   Rttf2pt1_1.3.12     
 [64] xfun_0.45            withr_3.0.0          fastmap_1.2.0       
 [67] mitools_2.4          boot_1.3-30          fansi_1.0.6         
 [70] digest_0.6.36        timechange_0.3.0     R6_2.5.1            
 [73] textshaping_0.4.0    colorspace_2.1-0     freesurfer_1.6.10   
 [76] jpeg_0.1-10          R.methodsS3_1.8.2    utf8_1.2.4          
 [79] generics_0.1.3       recipes_1.0.10       class_7.3-22        
 [82] httr_1.4.7           htmlwidgets_1.6.4    parameters_0.22.0   
 [85] ModelMetrics_1.2.2.2 pkgconfig_2.0.3      gtable_0.3.5        
 [88] timeDate_4032.109    htmltools_0.5.8.1    scales_1.3.0        
 [91] snakecase_0.11.1     gower_1.0.1          rstudioapi_0.16.0   
 [94] tzdb_0.4.0           oro.nifti_0.11.4     nloptr_2.1.1        
 [97] nlme_3.1-164         sjlabelled_1.2.0     KernSmooth_2.23-22  
[100] parallel_4.4.0       vipor_0.4.7          extrafont_0.19      
[103] pillar_1.9.0         grid_4.4.0           vctrs_0.6.5         
[106] car_3.1-2            extrafontdb_1.0      beeswarm_0.4.0      
[109] evaluate_0.24.0      pbivnorm_0.6.0       magick_2.8.4        
[112] cli_3.6.3            compiler_4.4.0       crayon_1.5.3        
[115] rlang_1.1.4          ggsignif_0.6.4       future.apply_1.11.2 
[118] labeling_0.4.3       classInt_0.4-10      plyr_1.8.9          
[121] sjmisc_2.8.10        stringi_1.8.4        viridisLite_0.4.2   
[124] stars_0.6-6          munsell_0.5.1        lazyeval_0.2.2      
[127] bayestestR_0.13.2    Matrix_1.7-0         sjstats_0.19.0      
[130] hms_1.1.3            bit64_4.0.5          future_1.33.2       
[133] haven_2.5.4          geomorph_4.0.8       bit_4.0.5           
[136] ape_5.8              RRPP_2.0.3